ABSTRACT

These four serial issues examine the effectiveness 
and appropriateness of a variety of assessment tests as well as their 
relationship to developmental education. Included are reviews of the 
following tests: (1) the Comparative Guidance and Placement Program, 
a self-scoring test of English and mathematics; (2) the Stanford 
Achievement Test, an advanced battery of tests of vocabulary, reading 
comprehension, mathematical concepts, computation and application, 
science, and social sciences; (3) the Comprehensive Test of Basic 
Skills, a test to assess proficiency in reading, language skill, 
arithmetic, science, social science, and study skills; (4) the 
Nelson-Denny Reading Test, which measures reading comprehension, 
vocabulary, and reading rate; (5) the California Achievement Test, 
measuring reading levels from grades 1 through 12 by testing reading 
comprehension and vocabulary; (6) the Sequential Test of Educational 
Progress, used to assess "higher order" intellectual skills related 
to reading, such as comprehension, inference, analysis, and 
translation; (7) the Canfield Learning Styles Inventory, a test to 
assess the affective dimensions of learning such as student 
preferences for conditions of learning, the content of learning, and 
the mode of learning as well as student expectations; and (8) the 
Kolb Learning Styles Inventory, a test to measure the cognitive 
dimensions of learning styles, such as students' use of reflective 
observation, abstract conceptualization, and active experimentation. 
In addition, "Assessing Assessment," by Dennis Gabriel, is presented, 
which discusses the results of four surveys of educational 
institutions on assessment strategies and test use, and which calls 
for more widespread use of mandatory testing and placement of 
underprepared students in basic skills courses. (PAA) 
About "RDE"...



Believe it or not, prac- 
tically the entire field of 
developmental education is 
based upon research. Re- 
search activities first sug- 
gested that the "open door" 
was in danger of becoming a 
"revolving door" without 
some sort of developmental 
intervention for underpre- 
pared students. Once these 
interventions were adopted 
on college and university 
campuses, research formed a 
basis for the practice of 
individualized instruction, 
mastery-learning, self- 
concept development, diag- 
nosis, prescription, and a 
host of other activities 
carried out through devel- 
opmental programs. 

Yet the typical develop- 
mental educator does not 
have the time to review the 
existing research. While a 
wealth of research informa- 
tion relating to developmen- 
tal education is available, 
it is often too difficult to 
find, or having been found, 
too arcane to interpret. 

It is this situation 
that makes "Research in De- 
velopmental Education" (RDE) 
such a potentially valuable 
tool for practitioners in 
the field. RDE is designed 
to review current research 
in areas relating to the 
practice of developmental 
education. Furthermore, it 
will attempt to interpret 
this research in terms of 
its applicability to devel- 
opmental education programs. 

We plan to publish RDE at 
least five times each year. 
Each issue will be devoted 
to research on a particular 
topic of interest to devel- 
opmental educators. Each 
issue will include a review 
of relevant research, a sum- 
mary of research findings, a 
list of suggested applica- 
tions for these findings, 
and suggested resources for 
learning more about the to- 
pic under consideration. It 
is hoped that such a publi- 
cation will offer a viable 
link between research and 
practice. 

Of course, like any 
other publication, RDE will 
take time and money to pro- 
duce. Although it is pro- 
vided as a service to the 
field of developmental edu- 
cation, we must pay for re- 
production, printing, typ- 
ing, and mailing. As a re- 
sult, RDE is being offered 
on a subscription basis. 
Our initial subscription 
rate will be $9.50 per year. 
This includes a volume of 5 
issues appearing in Septem- 
ber, November, January, 
March, and May. We will of- 
fer RDE during the 1983 cal- 
endar year and then review 
the situation in January of 
1984. If we can meet our 
costs as of that time, we 
will continue to publish 
and, we hope, expand the 
newsletter. If we cannot 
meet our costs, we will dis- 
continue its publication. 

Needless to say, your co- 
operation and support is ne- 
cessary if RDE is to become 
a long-term reality instead 
of a short-lived good idea. 
What can you do to help?... 
First of all, we need your 
subscription to RDE. 

We also need your advice 
and feedback. What topics 
would be of interest to de- 
velopmental educators? How 
can we improve the content 
or the lay-out of RDE? What 
sort of information should 
we provide regarding resour- 
ces? How can we improve 
dissemination of RDE? 

If you have any comments 
or suggestions in any of 
these areas, please let us 
know. We will look forward 
to hearing from you (and to 
receiving your subscription 
request for the next five 
issues of RESEARCH IN DE- 
VELOPMENTAL EDUCATION). 



THANK YOU 




PLACEMENT TESTS 
EVALUATION & COMMENTS 



One of the activities 
most frequently encountered 
in developmental programs is 
testing designed to obtain 
information for placement. 
Cross (1976) found that over 
80% of the programs she sur- 
veyed in 1974 utilized some 
sort of standardized tests 
to place students in develop- 
mental courses. Roueche and 
Snow found similar results in 
their survey of developmental 
programs (1977). 

A 1979 survey conducted by 
the Center for Developmental 
Education suggested that two 
instruments used most fre- 
quently for placement pur- 
poses in developmental pro- 
grams were the COMPARATIVE 
GUIDANCE AND PLACEMENT PRO- 
GRAM (CGP) and the STANFORD 
ACHIEVEMENT TEST. In recent 
years, many developmental pro- 
grams have also adopted the 
COMPREHENSIVE TEST OF BASIC 
SKILLS (CTBS) for placement 
purposes. 

Since these three tests 
seem to be the most widely- 
used comprehensive placement 
instruments in developmental 
programs, it is appropriate 
for the inaugural issue of 
RESEARCH IN DEVELOPMENTAL ED- 
UCATION to review their ad- 
vantages and disadvantages as 
noted in current research. 
Before looking at the indivi- 
dual tests, however, a word 
of advice regarding the use of 
placement instruments may be 
in order. 

As Glaser (1971, p. 8) 
notes, norm-referenced place- 
ment instruments provide in- 
formation regarding students' 
"...relative standing along 
a continuum of attainment... 
they tell that one student 
is more or less proficient 
than another but not how pro- 
ficient either of them is 
with respect to the subject 
matter tasks involved." As 
a result, such placement in- 
struments are measures only 
of what a student can do at 
the time of testing. They 
do not measure student po- 
tential for learning nor do 
they provide precise measure- 
ment of student deficiencies. 
Placement tests, therefore, 
are best used in sorting 
students into broad categor- 
ies. They are less suitable 
for predicting how well a 
given student might perform 
or for diagnosing student 
weaknesses. 

Assuming that placement 
instruments are to be used 
in sorting students for ad- 
visement and scheduling pur- 
poses, just how good are the 
tests most commonly used in 
developmental programs? The 
answer to this question can 
be found in the psychometric 
literature. 

THE CGP 

The COMPARATIVE GUIDANCE 
& PLACEMENT PROGRAM, pub- 
lished by the College En- 
trance Examination Board for 
the Educational Testing Ser- 
vice, is a self-scoring test 
of English and mathematics. 
The test includes sections 
on reading and written Eng- 
lish expression and in math- 
ematics computation, applied 
arithmetic, elementary and 
intermediate algebra. Of 
all the tests reviewed, the 
CGP has the shortest admini- 
stration time -- a maximum 
of 105 minutes for all six 
sections. 

In reviewing the CGP, Ham- 
bleton suggests that the in- 
strument's reliability has 
yet to be firmly established 
(1978). Reliability data 
provided in the CGP Technical 
Manual (1975) is insufficient 
to establish the degree to 
which student scores on the 
instrument will vary signi- 
ficantly from one administra- 
tion to another. This po- 
tential weakness of the CGP 
is also noted by Zytowski 
(1974). 

The validity of the CGP 
has also been questioned by 
reviewers. Hastings (1978) 
has suggested that the con- 
tent validity of the instru- 
ment is weak and as a result 
it is difficult to determine 
exactly what the test mea- 
sures. The CGP is also some- 
what lacking in predictive 
validity. Correlations be- 
tween CGP scores and later 
student performance in the 
skill areas measured by the 
test seldom exceed .40 -- 
a correlation coefficient of 
only slight significance 
(Zytowski, 1974). 
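
One common way to put a correlation of .40 in perspective is to square it, which estimates the share of variance in later performance that the test scores account for. The Python sketch below is an illustration only; the function name is hypothetical and does not come from any of the reviews cited.

```python
def variance_explained(r: float) -> float:
    """Return r squared, the share of variance accounted for by a correlation r."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("a correlation must lie between -1 and 1")
    return r * r


# A correlation of .40, the ceiling reported for the CGP, corresponds to
# roughly 16 percent of the variance in later performance.
share = variance_explained(0.40)
print(f"{share:.0%} of variance accounted for")
```

Read this way, about five-sixths of the variation in later performance is left unaccounted for by the test score alone.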

Although the CGP may have 
some limitations to its val- 
idity, these may be overcome 
by developing local norms as 
recommended by Maxwell (1979). 
Such norms, when matched a- 
gainst later student perfor- 
mance, can increase the val- 
idity of decisions based on 
CGP scores. 

When local norms are used 
in making placement decisions 
based on the CGP, the instru- 
ment can serve as a valuable 
placement tool. As Maxwell 
points out, the CGP has the 
advantage of saving "...both 
student and advisor time be- 
cause the results are imme- 
diately available to use in 
planning a program and sche- 
duling a student into appro- 
priate skills courses" (1979, 
p. 44). 
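
The idea of matching local norms against later student performance can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a procedure taken from Maxwell or the CGP manual: the records are invented, the function name is hypothetical, and "best" is assumed to mean the cut-off that classifies the most students correctly.

```python
from typing import List, Tuple


def best_cutoff(records: List[Tuple[int, bool]]) -> int:
    """Pick the cut-off score that classifies the most students correctly.

    Each record is (placement_score, succeeded_in_regular_course).
    Students scoring at or above the cut-off are predicted to succeed.
    """
    if not records:
        raise ValueError("need at least one record")
    candidates = sorted({score for score, _ in records})
    best_score, best_hits = candidates[0], -1
    for cutoff in candidates:
        # Count students whose predicted outcome matches the actual outcome.
        hits = sum((score >= cutoff) == passed for score, passed in records)
        if hits > best_hits:  # on a tie, the lower cut-off is kept
            best_score, best_hits = cutoff, hits
    return best_score


# Hypothetical local data: (score, passed the regular course)
local_records = [(12, False), (15, False), (18, False), (20, True),
                 (22, False), (25, True), (28, True), (30, True)]
print(best_cutoff(local_records))
```

A program with real records would also want to weigh the two kinds of misplacement differently, since placing a prepared student in a developmental course and placing an underprepared student in a regular course do not carry the same cost.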

...precision of measurement is 
important. 

The best placement test- 
ing programs are likely to 
be those which strike a 
thoughtful balance between 
various testing considera- 
tions. Ease of administra- 
tion, scoring, interpreta- 
tion, advising, placement 
procedures, and level of pre- 
cision required are all im- 
portant considerations in de- 
signing a testing program. 
It is important to remember, 
however, that placement in- 
struments are best used when 
making broad categories of 
decisions regarding enroll- 
ment in a particular course 
or curriculum. When used in 
this manner, placement tests 
can be powerful tools for 
promoting academic develop- 
ment. When used for other 
purposes, they may, at best, 
be of marginal value and, at 
worst, be destructive to 
students and instructors 
alike. 

THE STANFORD TEST 

The Stanford Achievement 
Test, published by the Psy- 
chological Corporation, was 
designed originally for as- 
sessment of elementary stu- 
dents. The recent addition 
of an advanced battery pro- 
viding readings at the 7.0 
to the 9.9 grade level makes 
the instrument acceptable for 
use in basic skill assessment 
of college students. 

The advanced battery in- 
cludes tests of vocabulary, 
reading comprehension, math- 
ematics concepts, computation 
and application, science and 
social science. The Stanford 
test takes considerably more 
time to administer than the 
CGP. As much as 315 minutes 
may be required to administer 
all sections of the advanced 
battery. 

Available evidence sug- 
gests that the Stanford is a 
highly reliable instrument. 
Technical data prepared by 
the publishers and reviewed 
by Ebel (1978) indicates that 
the reliability coefficients 
for most sections of the test 
are at or above the .90 level. 
This is a very high index of 
reliability and it suggests 
that little fluctuation from 
one administration of the 
test to another may be expec- 
ted in student scores. 
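
What a reliability of .90 means for an individual score can be illustrated with the standard error of measurement from classical test theory, SEM = SD x sqrt(1 - reliability). The sketch below is offered as an illustration only; the standard deviation of 10 points is a made-up figure, not one reported for the Stanford, and the function name is hypothetical.

```python
import math


def standard_error_of_measurement(sd: float, reliability: float) -> float:
    """Classical test theory SEM: sd * sqrt(1 - reliability)."""
    if sd < 0:
        raise ValueError("standard deviation cannot be negative")
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must lie between 0 and 1")
    return sd * math.sqrt(1.0 - reliability)


# Hypothetical score scale with a standard deviation of 10 points.
for reliability in (0.90, 0.60):
    sem = standard_error_of_measurement(10.0, reliability)
    print(f"reliability {reliability:.2f}: SEM of about {sem:.1f} points")
```

On this assumed scale, a drop in reliability from .90 to .60 doubles the expected wobble in a student's score from one administration to the next.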

Insofar as validity is 
concerned, the Stanford has 
received mixed reviews al- 
though it is generally con- 
sidered sound. Lehmann (1975) 
has argued that insufficient 
data has been provided by the 
publishers to determine the 
instrument's content validity. 
This criticism is also noted 
by Ebel (1978). On the 
other hand, Passow (1978) 
points out that the test's 
content validity is estab- 
lished by the fact that 
classroom instructors were 
consulted at every step in 
the test's development and 
that they assisted in de- 
signing questions and review- 
ing content. This, alone, 
should serve to support the 
content validity of the Stan- 
ford. 



The Stanford also appears 
to possess a high degree of 
concurrent validity as it 
correlates well with other 
achievement measures. The 
various sections of the test 
also correlate well with one 
another. In fact, Kasdon 
(1974) points out that the 
science and social science 
components of the test are 
so highly inter-correlated 
with other components that 
it may not be necessary to 
administer the entire bat- 
tery in order to obtain ac- 
curate information. 

THE CTBS 

The COMPREHENSIVE TEST OF 
BASIC SKILLS (CTBS) is pub- 
lished by McGraw-Hill. It 
is designed to assess stu- 
dent proficiency in the 
areas of reading, language 
skill, arithmetic, and study 
skills. The newer version 
of the test (forms T, U, and 
V) also provides a measure 
of science and social science 
skills. Level IV of the CTBS 
measures skills in the grade 
ranges of 8.5 to 12.9 making 
it quite appropriate to col- 
lege students. 

Like the Stanford test, 
the CTBS possesses a high de- 
gree of reliability. Ahmann 
(1972) reports that the reli- 
ability coefficients for sub- 
sections of the test range 
from .85 to .95, thus making 
it a highly reliable instru- 
ment. 

The CTBS also appears to 
have adequate validity -- 
particularly content vali- 
dity. The CTBS Manual (1976) 
provides data supporting the 
content validity and concur- 
rent validity which suggest 
that the instrument is quite 
sound in these areas. In 
his review of the CTBS, Ah- 
mann agrees that the content 
validity of the test is ad- 
equate. He also notes that 
the CTBS is strongly corre- 
lated with the CALIFORNIA 
BASIC SKILLS TEST. In fact, 
he raises the question as to 
why it was necessary to pro- 
duce the CTBS at all since it 
is so highly correlated with 
the California test (also 
published by McGraw-Hill). 
Putting the question of re- 
dundancy aside, however, the 
high inter-correlation be- 
tween these two instruments 
indicates that the CTBS pos- 
sesses strong concurrent 
validity. 

One of the advantages of 
the CTBS is that, of all the 
tests reviewed here, the de- 
signers of the CTBS have done 
the best job of controlling 
for cultural bias. According 
to Findley (1978), bias was 
controlled for by: 1) having 
all test items reviewed by 
qualified editors to elimin- 
ate any items that might con- 
tribute to bias and, 2) eli- 
minating all items on which 
minorities scored less well 
than the standardized sample 
but which were not positively 
correlated with later perfor- 
mance. These procedures make 
the CTBS one of the least 
biased placement instruments 
on the market. 
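
The second screening rule Findley describes can be read as a simple filter over item statistics. The Python sketch below is one hedged interpretation, not the publisher's actual procedure: the field names, the item data, and the use of a zero threshold for "not positively correlated" are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ItemStats:
    item_id: str
    p_minority: float     # proportion correct among minority examinees
    p_standard: float     # proportion correct in the standardization sample
    r_performance: float  # item correlation with later performance


def items_to_drop(items: List[ItemStats]) -> List[str]:
    """Flag items that disadvantage minorities without predicting performance."""
    return [
        item.item_id
        for item in items
        if item.p_minority < item.p_standard and item.r_performance <= 0.0
    ]


# Hypothetical item statistics.
pool = [
    ItemStats("A", p_minority=0.55, p_standard=0.70, r_performance=0.25),
    ItemStats("B", p_minority=0.40, p_standard=0.65, r_performance=-0.05),
    ItemStats("C", p_minority=0.72, p_standard=0.70, r_performance=0.00),
]
print(items_to_drop(pool))
```

Under these assumptions, an item with a group difference is retained when it still predicts later performance, and dropped only when the difference comes with no predictive value.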

In spite of its many de- 
sirable qualities, the CTBS 
does have one major disadvan- 
tage. It takes a rather sub- 
stantial amount of time to 
administer. If the complete 
test battery is administered, 
maximum time required may be 
as much as 335 minutes. The 
time required for administra- 
tion, however, may be reduced 
by eliminating some of the 
less relevant sub-tests such 
as science and social science. 

SUMMARY & RECOMMENDATIONS 

All of the tests reviewed 
here have strengths and weak- 
nesses as placement measures. 
The CTBS appears to have a 
slightly greater degree of 
content validity than the 
Stanford and a considerably 
greater degree than the CGP. 
On the other hand, it is also 
the longest of the three in- 
struments, taking more than 
three times as long to admin- 
ister as the CGP. 

Both the CTBS and the Stan- 
ford are highly reliable in- 
struments that will give con- 
sistent scores from one ad- 
ministration to another. The 
CGP appears to be somewhat 
less reliable than the other 
two instruments. 

The CTBS has the best con- 
trols for bias of any of the 
tests reviewed. However, the 
CGP is normed with a group 
containing a very high per- 
centage of minorities. The 
norming process of the CGP 
may, therefore, serve to re- 
duce the possibility of cul- 
tural bias in the instrument. 

All of the tests reviewed 
appear to lack predictive 
validity. In other words, 
they may not be good predic- 
tors of actual student per- 
formance in a given class. 
This, however, is not sur- 
prising. After all, the 
content of the instruments 
is fixed while the content 
of courses is likely to vary. 
For this reason, the devel- 
opment of local norms and 
the correlation of these 
norms with later student 
performance is strongly rec- 
ommended. 

In spite of the amount of 
research available to judge 
the quality of various tests, 
the question "which test is 
best for placement in devel- 
opmental courses?" cannot be 
answered simply. The "best" 
placement instrument is the 
one which best meets local 
needs. If ease of admini- 
stration and scoring is of 
primary importance, then the 
CGP may be appropriate in 
spite of its limitations. 
This is particularly true if 
local norms are developed to 
assist in placement decisions 
and if initial decisions can 
be adjusted easily. 

If precision in measure- 
ment is a primary considera- 
tion, however, the CTBS or 
the Stanford are likely to 
provide more precise infor- 
mation. The amount of time 
required to administer these 
instruments may be reduced, 
if necessary, by administer- 
ing only certain components 
of the tests. The reading, 
vocabulary, and mathematics 
sections of both the CTBS 
and the Stanford are both 
reliable and valid. They 
also correlate well with 
other components. Admini- 
stration time might be re- 
duced without compromising 
the quality of information 
provided by using only these 
test components. 

As a general rule, the 
more difficult it is to ad- 
just initial placement de- 
cisions, the more important 
it will be to obtain precise 
information as a result of 
placement testing. In sit- 
uations where students may 
move rather easily from a 
developmental course to a 
more advanced course in the 
first few weeks (or vice 
versa), then incorrect de- 
cisions regarding placement 
may be less damaging. In 
situations where initial 
placement decisions are dif- 
ficult to reverse, then pre- 

continued, page A 



Research in Developmental 
Education is published five 
times per academic year. 
Editor: Hunter R. Boylan 
Managing Editor: Doree N. 
Pitkin 
Consulting Editor: Milton 
"Bunk" Spann 
Manuscripts, news items, and 
abstracts are accepted by the 
Editor, RiDE, Center for 
Developmental Education, 
Appalachian State University, 
Boone, NC 28608. 
Subscriptions are $9.50 per 
year. North Carolina residents 
add 38 cents sales tax; 
subscribers in foreign coun- 
tries add $1.50/year shipping 
and pay by bank draft. Send 
subscriptions to Managing 
Editor, RiDE, at the same 
address or call (704)262-3057. 

1983 



Research in 




Volume 1 
issue S 



Developmental Education 



A Review of Diagnostic 
Reading Tests 

By: Hunter R. Boylan 

Practically all devel- 
opmental and learning as- 
sistance programs provide 
some sort of reading tutor- 
ing or instruction. In the 
majority of programs, this 
training in reading is pre- 
dicated on the assessment 
of students' existing rea- 
ding skills. It is from 
this assessment base that 
most reading instruction or 
tutoring proceeds. Conse- 
quently it is important for 
those who work in develop- 
mental programs to know 
what options are available 
to them for assessment of 
student reading skills. This issue of 
RiDE is, therefore, devoted to a review of 
the reading assessment instruments most 
commonly used in developmental programs. 

BACKGROUND AND METHOD 

To date, no national survey data is 
available to determine the types of diag- 
nostic reading tests used most commonly in 
developmental programs. However, several 
regional studies have been undertaken. 
The Arkansas Consortium for Developmental 
Education surveyed reading instruments 
used by practitioners in the State of 
Arkansas in 1978 (ACDE Newsletter). The 
Center for Developmental Education sur- 
veyed the diagnostic activities of North 
Carolina colleges participating in a 
regional consortium in 1979 (Center for 
Developmental Education, 1980). Perhaps 
the most important current study was 
completed by the Washington Association 
for Developmental Education under a FIPSE 
grant in 1982. This survey reported data 
from 28 community colleges in the State of 
Washington and provided information on 
user reactions and recommendations for 
the use of various tests (WADE, 1983). 

While these surveys are 
regional, the reported use 
of reading assessment tests 
was fairly consistent in 
each survey. It seems rea- 
sonable to conclude, there- 
fore, that they are some- 
what representative of 
developmental and learning 
assistance programs across 
the country. While the 
degree of this representa- 
tion is unknown, the infor- 
mation from these studies 
at least provides a base- 
line from which to make 
informed judgements about 
reading test utilization. 



Based on the available 
data, an informed judge- 
ment would suggest that only a handful of 
reading instruments are used with any 
degree of regularity in developmental and 
learning assistance programs. The surveys 
reviewed included a total of 68 institu- 
tions. Among these institutions, only 5 
reading tests were used with regularity. 

These include the Nelson-Denny Reading 
Test and the reading sections of the Cali- 
fornia Achievement Test, the Comparative 
Guidance and Placement Program, the Stanford 
Achievement Test and the Sequential Test 
of Educational Progress. 

The Nelson-Denny Reading Test was 
the most widely used assessment in- 
strument with the reading section of the 
Comparative Guidance and Placement Program 
being the next most widely-used. The 
California Achievement Test, the Stanford 
Achievement Test and the Sequential Test 
of Educational Progress were all used 
with about equal regularity. 

A cautionary note should be added here 
since the vast majority of institutions 
included in the surveys reviewed were 
community and technical colleges. Of the 
institutions responding to various sur- 
veys, 58 were two-year schools and only 11 
were four-year schools. As a result, the 
comments made here are far more applicable 
to two-year institutions. 

After the five most frequently used 
reading diagnostic tests were identified, 
user responses were analyzed from the re- 
ports of WADE and the Center for Develop- 
mental Education (the Arkansas report did 
not include user responses). Buros' MEN- 
TAL MEASUREMENTS YEARBOOK (Eighth Edition, 
1978) was then consulted to obtain techni- 
cal information on these tests. 



THE CALIFORNIA ACHIEVEMENT TEST 

The California Achievement Test, pub- 
lished by CTB/McGraw-Hill, includes a rea- 
ding component measuring vocabulary and 
comprehension. It measures reading levels 
from grades 1 through 12 and results are 
reported in raw scores, percentiles, and 
stanines. The reading comprehension sec- 
tion of the test is considered by review- 
ers to be a sound measure of comprehension. 
The vocabulary section is, however, 
somewhat less precise. Since this section 
includes only 40 words, missing one or two 
items may make a big difference in place- 
ment results. 

The items included in the California 
Achievement Test are drawn from an exten- 
sive review of recommended curriculum ob- 
jectives from several state boards of edu- 
cation. 

The California Achievement Test is es- 
sentially an achievement test for the el- 
ementary and secondary levels. Its rele- 
vance to college-level placement is depen- 
dent upon the degree to which a given in- 
stitution's curriculum is keyed to the ob- 
jectives of the public school curriculum. 

A potential problem with the California 
Achievement Test is that some reviewers 
consider it to be sexually biased. A few 
of the terms used may be interpreted dif- 
ferently by women and answered incorrectly 
as a result (Lombard, 1978). This poten- 
tial difference does not seem to be re- 
flected in normative data for men and wo- 
men but it may make a slight, statistic- 
ally insignificant, difference in place- 
ment. 

Reliability 

The reliability of the comprehension 
section of this instrument is strong. The 
publishers cite a wide variety of studies 
establishing the internal consistency of 
the measure, with correlation coefficients 
in the .65 to .80 range. The vocabulary 
section appears to be slightly less reli- 
able but still strong. 

Validity 

The publishers claim that it is only 
necessary to establish content validity in 
an instrument that is, essentially, a na- 
tional achievement test. Content validity 
has been established by matching items ag- 
ainst recommended curriculum objectives. 
The content of test items does appear to 
match these objectives fairly well and the 
test does appear to have strong content 
validity. 

User Comments 

Most users seem to regard the Califor- 
nia Achievement Test as valid when used 
for general placement purposes. It may be 
used in making initial placement decisions 
but it does not provide specific informa- 
tion for diagnostic purposes. User com- 
ments also suggest that the reading pas- 
sages may not be appropriate for older ad- 
ults. 

THE COMPARATIVE GUIDANCE AND 
PLACEMENT PROGRAM 

The Comparative Guidance and Place- 
ment Program, better known as the CGP, is 
a self-scoring placement battery produced 
for the College Entrance Examination Board 
by ETS. Among community colleges, it is 
probably the most widely-used general 
placement battery. The reading section of 
the battery is relatively brief and is 
primarily a measure of comprehension. 

While the CGP has the advantage of be- 
ing quick and easy to administer, it is 
not a particularly precise instrument. 
The precision of the instrument may be im- 
proved, however, by gathering local data 
for normative purposes. Local norms may 
then be developed to establish cut-off 
points for placement in the college cur- 
riculum (Maxwell, 1978). 

The CGP was originally developed for 
community college placement. The basic 
construct of the CGP is that there is an 
identifiable body of skills associated 
with college-level work and that these 
skills can be identified, and items can be 
developed that will match these skills. 

Since the level of academic skill re- 
quired for success varies dramatically 
from one college to another, it is doubt- 
ful that the CGP can accomplish this 
purpose without local normative data. 
With such data the CGP is probably a rea- 
sonably valid general placement instru- 
ment. It also has the advantage of being 
keyed to what the publishers consider to 
be college-level academic competencies 
rather than to high school-level competen- 
cies. 

Reliability 

The reliability of the reading section 
of the CGP is not particularly strong. 
Correlation coefficients for internal con- 
sistency range from .35 to .60 and the 
reading section's correlation with other 
sections of the test is also low. 
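
The review does not say which internal-consistency statistic produced the .35 to .60 figures. As an illustration only, the Python sketch below computes one widely used coefficient, Cronbach's alpha, from a small table of item scores; the data are invented and the choice of alpha is an assumption, not a claim about the CGP manual.

```python
from statistics import pvariance
from typing import List


def cronbach_alpha(scores: List[List[float]]) -> float:
    """Cronbach's alpha, where scores[s][i] is student s's score on item i."""
    if len(scores) < 2 or len(scores[0]) < 2:
        raise ValueError("need at least two students and two items")
    n_items = len(scores[0])
    # Variance of each item across students.
    item_vars = [pvariance([row[i] for row in scores]) for i in range(n_items)]
    # Variance of the students' total scores.
    total_var = pvariance([sum(row) for row in scores])
    if total_var == 0:
        raise ValueError("total scores do not vary")
    return (n_items / (n_items - 1)) * (1 - sum(item_vars) / total_var)


# Hypothetical right/wrong (1/0) responses: four students, three items.
responses = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
]
print(round(cronbach_alpha(responses), 2))
```

Values closer to 1 indicate that the items tend to rank students the same way, which is the property the low CGP reading coefficients call into question.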

Validity 

The publishers attempted to establish 
content validity by having test items re- 
viewed by community college faculty. 
The purpose of this review was to deter- 
mine whether or not the items were consis- 
tent with the community college curricu- 
lum. The CGP is one of the few instru- 
ments in which actual community college 
faculty were involved in validation of 
test items. 

The publishers have attempted to estab- 
lish the predictive validity of the in- 
strument by correlating CGP results with 
student grades. The correlations were ex- 
tremely low (ranging from .30 to .40). 
The predictive validity of the CGP may, 
however, be improved by using local norms 
for prediction purposes (Buros, 1978). 



User Comments 

Users tend to like the CGP because it 
is quick and easy to score. It does allow 
for faster testing of a larger number of 
students than most other instruments. 
Furthermore, since the test may be self- 
scored, there is no need to wait for com- 
puter processing of test results. Stu- 
dents can obtain their test scores almost 
immediately after taking the test. This 
immediate feedback is certainly an advan- 
tage. 

Most users are aware of the instru- 
ment's shortcomings as a placement device. 
Users consistently suggest that it be used 
only as a very generalized placement mea- 
sure and that placement decisions be made 
on the basis of local norms. 

THE NELSON-DENNY READING TEST 

The Nelson-Denny Reading Test is the 
most widely-used instrument among those 
responding to the surveys reviewed. Max- 
well (1978) suggests that it may also be 
the instrument most widely used by 4-year 
colleges and universities for initial 
placement purposes. The publishers claim 
that the test is designed to provide a 
trustworthy ranking of student ability in 
the areas of reading comprehension, vocab- 
ulary development, and reading rate. The 
construct used in designing the test was 
that normative data from college students 
will yield a valid measure of how well 
such students ought to be able to read at 
various college grade-level equivalents. 
Grade level reports, therefore, are based 
on normative data rather than any particu- 
lar curriculum objectives. Reviewers 
generally suggest that this construct is 
valid and that the test does accomplish 
the purposes for which it was designed 
(Forsyth, 1978 and Cummins, 1981). 



The Nelson-Denny is well regarded for 
college placement purposes because it is 
designed for college students and it is 
relatively quick and easy to administer. 
The most recent forms of the test (Forms E 
and F) also allow self-scoring. 



The instrument measures reading skills 
at the level of 9th through the 16th 
grades. However, owing to limitations in 
the norming sample, scores from grades 13 
to 16 are less reliable than scores for 
the lower levels. Grade level placement 
is also questionable because only a few 
items will make a difference of one or 
more grade levels. Raw scores, stanines 
and percentiles are also reported on the 
Nelson-Denny and these may be better in- 
dicators of students' actual performance. 



Reliability 

The Nelson-Denny Reading Test tends to 
have fairly high reliability for its voc- 
abulary and comprehension sections. The 
range of reliability scores reported for 
these sections is .82 to .91 and .68 to 
[...] reading rate assessment is considerably 
lower, ranging from .54 to .66. 

Validity 

The Nelson-Denny Reading Test has been 
validated primarily through norm-referenc- 
ing. The publishers provide a variety 
of technical data from studies conducted 
with appropriate samples of college stud- 
ents taken at various points in the aca- 
demic year. The instrument has also been 
reviewed for content validity by curricu- 
lum experts. The instrument appears, on 
the basis of these tests, to be quite 
valid when used as recommended. 

User Comments 

In general, users are quite satisfied 
with the quality of information yielded by 
the Nelson-Denny as long as the results 
are used for general placement purposes. 
While the Nelson-Denny is often used as a 
diagnostic instrument, it does not provide 
sufficiently precise data to diagnose spe- 
cific strengths and weaknesses. Users 
also caution against relying on alternative
forms of the Nelson-Denny for pre- and
post-test investigation of reading gains.
Apparently, pre- and post-test results
are heavily influenced by test familiarity
(Cummins, 1981).



THE SEQUENTIAL TEST OF
EDUCATIONAL PROGRESS

The Sequential Test of Educational
Progress (STEP) is produced by the
Educational Testing Service and published by
the Addison-Wesley Company. Unlike other
reading instruments reviewed here, the
STEP is designed to assess so-called
"higher order" intellectual skills such as
comprehension, inference, analysis, and
translation (Wardrop, 1978). Items in the
reading section of the test are selected
on the basis of how well they measure
these constructs. Because of the
importance of such skills in the college-level
curriculum, the STEP is generally regarded
as a good placement instrument for college
students although it was originally based
on elementary and secondary students.

The STEP reading section includes 60 
items, 30 to test vocabulary and 30 to 
test comprehension. Unfortunately, only a
combined score from both sections is
provided. As a result, the STEP is diffi- 
cult to use for diagnostic purposes 
(Johnson, 1978). This is unfortunate 
since more specific information on higher 
order intellectual skill strengths and 
weaknesses would be quite valuable to 
developmental educators. 

The STEP reports scores in grade level
equivalents, ranging from grades 4 to 14.
It also reports stanine and percentile 
scores. Johnson (1978) notes that, for 
reasons as yet unknown, women tend to out- 
perform men on the reading section of the 
STEP by as much as one score point. 

Reliability 

Reliability tests for the STEP were 
conducted using measures of internal 
consistency and analysis of test-retest 
reliability using alternative test forms. 
In both areas, the STEP demonstrated a 
high degree of reliability. The most 
recently reported ranges for reliability 
were between .76 and .93 (Johnson, 1978). 

Validity 

The test developers established content
validity through expert review of the
items to ensure that they were related to
the higher order skills being assessed. 
No data has yet been provided to establish 
the construct validity of the instrument. 

User Comments

Because the vocabulary and comprehen- 
sion scores are combined in the STEP, 
users are frequently dissatisfied with the
utility of results. Furthermore, since 
the test is based on a set of theoretical 
constructs rather than specific curriculum
objectives, it has little predictive
validity. The STEP may be more useful as
a measure of intellectual skills than as a 
placement or diagnostic instrument. 

THE STANFORD ACHIEVEMENT TEST

The Stanford Achievement Test, pub- 
lished by the Psychological Corporation, 
is one of the earliest and most reliable
reading assessment instruments in exis- 
tence. The test was originally designed 
for the assessment of elementary school 
students and, for many years, the cut-off 
point for measurement was at the 6th grade
level. More recent editions of the test 
measure skills up to grade level 9.9. 
This comparatively low cut-off range makes
it appropriate only for basic skills stu- 
dents at the college level. 

The Stanford Achievement Test measures
vocabulary, comprehension, and word attack
skills. Test items used to measure these
skills are based on analysis of school
textbooks, analysis of curriculum
objectives, and expert review. Since the test
is keyed to the public school curriculum,
its relevance to college level placement
is questionable. The Stanford Achievement
Test is much like the California
Achievement Test in this regard. Glass (1978)
suggests that there is really little
difference between the two instruments.

The Stanford Achievement Test provides
grade equivalent scores, percentile ranks,
and stanines. These scores are normed on
a substantial population including 275,000
school children from 43 different states.
As a result, the normative data for this
instrument are probably the strongest of
any of the reading tests reviewed here.



Reliability 

Since the Stanford Achievement Test has
been in existence for so many years and
has been so widely used in the public
schools, reliability has been continually
analyzed and improved. It is probably one
of the most reliable instruments on the
market, with most reliability coefficients
ranging from .85 to .95.



Validity 

Like the California Achievement Test, 
the Stanford is designed as a national 
measure of school achievement. As a re- 
sult, the publishers claim that the only 
appropriate form of validity for assessing
the instrument is content validity. Over 
the years, a variety of testing experts 
have analyzed this instrument and
carefully verified this content validity.




User Comments

Most users consider the Stanford
Achievement Test to be useful for general
placement of underprepared students. This
is particularly true for those who are
recent high school graduates. Because of
its strong elementary school content,
however, its use with adults may be
questionable. Similarly, most users indicate that
the instrument has relatively little
predictive validity.



SUMMARY

Of the tests reviewed here, the Nelson- 
Denny Reading Test appears to be most ap- 
plicable to college level placement. It 
is valid, reliable, and easy to use. It 
is normed specifically for use with col- 
lege level students and it reports vocab- 
ulary, comprehension, and reading rate 
scores. 

The Nelson-Denny does, however, have 
its limitations. Its measurement at the 
upper grade levels is less reliable than 
at the lower grade levels. This is fur- 
ther complicated by the fact that only a 
few items will make a difference between 
placement in one grade level or another. 
Also, the Nelson-Denny does not provide 
sufficient information to make it usable 
for specific diagnosis of reading prob- 
lems. It appears to be best used as a 
generalized placement instrument.

There may, however, be excellent rea- 
sons for using a placement instrument 
other than the Nelson-Denny. The
Nelson-Denny is specifically a reading test. It
does not provide placement data in other 
subject areas. Instruments such as the 
CGP or the California Achievement Test do
provide a complete battery of tests for 
placement purposes. Also, the
Nelson-Denny may be too difficult for
underprepared students. Other instruments provide
much broader placement ranges that may be 
more suitable to severely underprepared 
students. 

Perhaps the most important point to be 
made here is that none of the tests
reviewed are particularly useful as diagnostic
instruments. They are used best as
pre-screening devices to give practitioners
general information on where to begin
in working with developmental students.

Reviewing Learning Styles Inventories:
The Canfield and Kolb LSIs
By Hunter R. Boylan



One of the more important trends in
developmental education is the increasing
use of learning styles assessment.
Educators have long known that individual
students learn in different ways and at
different rates, and a variety of teaching
systems have been developed to accommodate
individuality in learning. Some of the
most notable techniques for individualizing
instruction at the college level
include Holland and Skinner's "Programmed
Instruction" (1961), Keller's "Personalized
System of Instruction" (1968), and
Postlethwait's "Audio-Tutorial
Instruction" (1969).

All of these systems accommodate
individual rates of learning and make some
provision to offer different kinds of
learning experiences. They did not,
however, accommodate individual styles of
learning. A major reason for this was
that, until fairly recently, there were
no instruments available for assessing
which students' learning styles might be
best served by any given type of
instruction.

However, the relatively recent
development of several different types of
learning styles measures has solved this
problem to some extent. Using learning
styles assessment, we are now able to
determine, with some degree of accuracy,
which students learn best from which
instructional techniques. This issue of
RESEARCH in DEVELOPMENTAL EDUCATION (RiDE) 
is devoted to a review of two of the more 
popular learning styles measures: the 
Canfield Learning Styles Inventory and the 
Kolb Learning Styles Inventory. 

The Canfield LSI

The Canfield LSI was originally 
developed in 1972 in order to "...measure 
some of those affective variables that 
seem to affect learning, and which 
contribute to satisfactory and effective 
adjustment to the teaching-learning 
situation" (Canfield, 1980, p. 1). It
should be noted that the Canfield
instrument is the only one on the market
that emphasizes the affective dimensions
of learning as opposed to the cognitive
dimensions.

The Canfield LSI measures four 
dimensions of student preference in 
learning situations. These include the
conditions of learning, the content of
learning, the mode of learning, and 
student expectations in a learning 
situation. 

Under the category of conditions of 
learning, the Canfield LSI measures 

student preferences for: 

1. Affiliation - pleasant, friendly,
and warm relations with other students or
with faculty;

2. Structure - orderly, logical, and
well-defined goals, objectives, and study
plans;

3. Achievement - independence,
self-determined goals and objectives in
relation to perceived skills and
interests; and

4. Eminence - competition, knowledge
of one's own performance in relation to
others, or need for control or authority.

Under the category of content, the
Canfield LSI measures student preferences
for working with various sorts of
content. These content sub-categories
include: numeric, qualitative (working
with words or language), inanimate
(working with things), and working with
people.

Canfield agrees with Gagne's notion
of "channel efficiency," the idea that in
every individual, some channels of
perceiving and processing information are
more effective than others (1967). As a
result, his instrument also measures
students' preferred mode of learning. The
categories included under this heading
are: listening, reading, iconics
(learning through illustrations, movies,
slides, graphs, pictures, etc.), and
direct experience.

Finally, the Canfield LSI assesses
students' expectations of learning, i.e.,
their anticipated level of performance.
The levels of anticipation include
outstanding or superior performance, good or
above-average performance, average or
satisfactory performance, and below-average
or unsatisfactory performance.

The instrument measures these
categories through 30 items in which students
are asked to rank order their preferences
among four choices. The structure of the
questions requires that students make a
"forced choice" in responding. For
instance, item 19 in the Canfield LSI
requires respondents to rank order the
following as a means to learn new material:
1) hearing a lecture, 2) reading a book
or text, 3) viewing a movie or slides,
or 4) experimenting with a small sample.

Administration of the Canfield LSI
generally takes 30 to 45 minutes. The
inventory is designed to be self-scoring,
although the manual suggests that "extra
caution should be taken to assure an
understanding of how the answers are to be
recorded on the separate answer sheet"
(Canfield, 1980, p. 9). The inventory
package includes the test booklet, answer
sheet, and a chart for use in plotting
one's learning preferences.

The original version of the Canfield 
LSI has been criticized because of the 
reading level of certain items. This was
considered to make the test less valid for
use with developmental students. A
revised version of the Canfield LSI was
developed in 1981. The revisions on this
form of the Canfield LSI make it much more
appropriate for use with developmental 
students or any other group with poor 
reading skills. 

Reliability 

Reliability for the Canfield LSI was
established through the use of item
analysis, split-half reliability tests,
and inter-scale correlations. Each of
these tests suggested that the Canfield
LSI is a highly reliable instrument. As
is shown in TABLE I, the split-half
reliabilities were exceptionally high.
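For readers unfamiliar with the split-half approach, the computation can be sketched as follows. This is only an illustration under stated assumptions: the response data are invented, the function names are hypothetical, and the Spearman-Brown step-up is a common convention that the Canfield manual may or may not have applied. The odd-versus-even split shown here corresponds to the second column of TABLE I.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def spearman_brown(half_r):
    """Step a half-test correlation up to an estimate for the full test."""
    return 2 * half_r / (1 + half_r)


def split_half_reliability(item_scores):
    """Correlate odd-numbered and even-numbered item totals per respondent."""
    odd_totals = [sum(row[0::2]) for row in item_scores]   # items 1, 3, ...
    even_totals = [sum(row[1::2]) for row in item_scores]  # items 2, 4, ...
    return spearman_brown(pearson(odd_totals, even_totals))


# Hypothetical rank responses: one row per student, one column per item.
responses = [
    [4, 3, 4, 4],
    [2, 2, 1, 2],
    [3, 3, 3, 2],
    [1, 2, 1, 1],
    [4, 4, 3, 4],
]
print(round(split_half_reliability(responses), 2))
```

A first-half versus second-half split, the other column in TABLE I, would differ only in how the items are divided before the two totals are correlated.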

Validity 

In addition to expert review to 
establish content validity, a number of
studies were undertaken to determine the 
degree to which students with different 
majors and backgrounds obtained different 
score patterns on the Canfield LSI. 
According to the test manual, "...several 
statistically significant differences were 
found between all pairs of the following 
groups: 

1. 52 criminal justice students. 

2. 208 business students. 

3. 108 education students.

4. 63 physical therapy students. 

5. 42 physical therapy faculty. 

Additional findings from other studies 
verifying significant score differences 
between different groups of students are 
also reported in the technical manual. 

The Kolb LSI

Unlike the Canfield instrument, the
Kolb LSI measures cognitive, rather than
affective, dimensions of learning styles.

The Kolb LSI was derived from Kolb's
"theory of experiential learning" (1984).
Kolb conceptualizes the learning process
as a series of responses designed to
resolve conflicts among four styles of
learning. These four styles include:

1. Concrete Experience - the use of
sensing and feeling to acquire new
information;

2. Reflective Observation - watching
and thinking about things in order to
learn;

3. Abstract Conceptualization -
obtaining information as abstractions and
then actively processing new learning; and

4. Active Experimentation - doing
something with new information or material
in order to learn it.

In the Kolb LSI, these four styles of
learning are assessed through the use of a
questionnaire. The original questionnaire
included nine items. As in the Canfield
LSI, respondents are asked to make a
forced choice among alternatives in each
question. Unlike the Canfield LSI, which
asks respondents for rank-order
preferences, the Kolb provides a series of words
and asks respondents to indicate which
words best describe them. For example,
the first item on the Kolb LSI asks
respondents to indicate which of the
following words are most or least descriptive
of their learning style: 1) discriminating,
2) tentative, 3) involved, 4)
practical.

The publishers of the Kolb LSI (McBer
and Co.) have recently revised the original
inventory. The "LSI 1985" includes
several changes, including expansion in the
number of items from nine to twelve and a
sentence completion rather than a word
choice format. The original version of
the Kolb LSI, like the Canfield, was also
criticized for the reading level of its
test items. The new version is written in
much simpler language than the original.
The revised version also includes a more
simplified scoring format.

Perhaps the most important revisions
in the 1985 version are the improved
reliability of the instrument and the
development of a more representative
normative sample. The original instrument
was heavily criticized because of its low
test-retest reliability. Freedman and
Stumpf (1980) state that "Test-retest
reliability for the two samples after only
three weeks was rather low, (median = .50)
suggesting that the LSI is rather
volatile" (p. 446).

The 1985 version has also been normed
with a much wider sample than the original
version. The current edition of the LSI
is based on a normative group including
various ethnic groups, occupational
fields, and income and educational levels.
According to the revised technical
specifications, the normative group had an
"...average education of two years in
college" (1985, p. [...]).

The Kolb "LSI 85" can be administered 
to most groups in about half an hour. The
instrument is packaged as a booklet which 
includes a description of the inventory, 
the inventory questions, instructions for 
self-scoring, and an explanation of the
scores plus a grid for plotting one's 
learning style. 

Validity 

There has been considerable debate 
among psychometrists as to the validity of 
the Kolb LSI. The items in the inventory 
were selected to be consistent with Kolb's 
experiential learning theory. The items 
were reviewed by Kolb and others to ensure
content validity, at least according to
the theory of experiential learning. Much
of the instrument's validity, therefore, 
is dependent upon the accuracy of Kolb's 
theory. 

As Kolb correctly notes, however, 
"Learning styles represent preferences for 
one mode of adaptation over others; but 
these preferences do not operate to the 
exclusion of other adaptive modes and will 
vary from time to time and situation to 
situation" (1982, p. 4). As a result, the
Kolb LSI is simply a straightforward, 
self-reporting mechanism designed to pro- 
mote recognition of the complexity of 
individual approaches to learning and to 
provide a quick assessment of an indivi- 
dual's preferences at a particular point 
in time. Since it is not designed to
assess fixed traits of individuals, standard
techniques for assessing validity may not
be applicable to the Kolb LSI.

Reliability

Since Kolb's theory assumes that
individual preferences will change and
that the development of a learning style
represents an adaptive process, some 
standard methods of assessing reliability 
may not be appropriate for the Kolb 
inventory. Nevertheless, it should be 
noted that the test-retest reliability of 
the original version was fairly low 
(Freedman and Stumpf, 1980).

The new version of the Kolb LSI has 
been checked for internal reliability 
using Cronbach's Standardized Scale Alpha. 
The resulting coefficients are all in the 
range of .70 to .90. Split-half reliabil- 
ity has also been assessed along with the 
correlation between the revised inventory 
and the original inventory. The reliabil- 
ity coefficients all proved to be signifi- 
cant at the .01 level with most of them 
being in the .80 to .90 range. The re- 
sults from these assessments are presented 
in TABLE II. 
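As a rough guide to what a standardized alpha in the .70 to .90 range implies, the coefficient can be written as a function of the number of items and the average correlation among them. The sketch below uses the standard formula for standardized alpha; the item counts and the mean inter-item correlation are hypothetical values chosen only for illustration and are not drawn from the Kolb technical specifications.

```python
def standardized_alpha(n_items, mean_inter_item_r):
    """Standardized Cronbach's alpha from item count and mean inter-item r.

    alpha = k * r / (1 + (k - 1) * r)
    """
    k, r = n_items, mean_inter_item_r
    return (k * r) / (1 + (k - 1) * r)


# Hypothetical illustration: with the same average inter-item correlation,
# lengthening a scale from 9 to 12 items raises the coefficient.
for k in (9, 12):
    print(k, round(standardized_alpha(k, 0.30), 2))
```

The formula shows that internal consistency rises with scale length even when the items are no more strongly related to one another, which is one reason longer revised forms often report higher coefficients.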

Summary Comments

In making a decision whether or not 
to use learning style inventories as part 
of an assessment process, practitioners
should recognize that knowledge of
learning styles is at a stage of relative
infancy. It can be fairly well
established that learning styles do exist.

It can also be established that the
learning style inventories discussed here 
do bear some relationship to subjective 
reality. In other words, those who take 
the inventories usually find them to be 
reflective to some degree of their actual
learning preferences. 

Unfortunately, no measure of human 
attributes is completely accurate. 
Insofar as learning style inventories are 
concerned, they may be less accurate than
other measures simply because our research
knowledge of learning styles is still
somewhat limited. Nevertheless, they do
appear to measure personal learning
preferences rather consistently and with
some degree of accuracy. They may be
imprecise but they do seem to have some
validity. 

Given this, there are several
potentially valid uses for learning styles
inventories. They can be used to
determine student preferences for learning
activities. The Canfield LSI can also be
used to determine how students like to
have courses organized and delivered, what
subjects students prefer to study, and how 
well students expect to perform academi- 
cally. 

Learning style inventories can be 
useful, therefore, as part of a pre-as- 
sessment program to determine the courses 
students should take and the types of 
instruction which seem to have the most
potential for success. They can also be 
used by instructors to determine how their 
courses may best be organized and deli- 
vered in order to maximize student lear- 
ning. 

Learning style inventories are simply
tools which can be used well or poorly. 
They can be useful for a variety of
purposes. They should improve our ability to
deliver appropriate instruction and to 
improve the quality of learning among 
students. They are not, however, thor- 
oughly accurate measures of fixed human 
traits. 
TABLE I

Split Half Reliabilities for the Canfield LSI

N = 1597

Scale                          First Half Versus    Odd Numbered Versus
                               Second Half          Even Numbered Items

Affiliation (peer)             [illegible]          [illegible]
Structure (organization)       [illegible]          [illegible]
Achievement (goal setting)     [illegible]          [illegible]
Eminence (competition)         [illegible]          [illegible]
Affiliation (instructor)       [illegible]          [illegible]
[scale name illegible]         .97                  .38
Achievement (independence)     .97                  .98
[scale name illegible]         .98                  .98
Numeric                        [illegible]          [illegible]
Qualitative                    .98                  .99
Inanimate                      [illegible]          [illegible]
People                         [illegible]          [illegible]
Listening                      [illegible]          [illegible]
Reading                        .99                  .99
Iconic                         .98                  .98
Direct Experience              .96                  .96
Expectancy (A)                 .98                  .99
Expectancy (B)                 .97                  .97
Expectancy (C)                 .98                  .99
Expectancy (D)                 .99                  .99

Source: LEARNING STYLES INVENTORY MANUAL (Canfield, 1980).
TABLE II

Split Half Reliabilities of the Kolb LSI

N = 268

Category                       Split-Half Reliability    Correlation Between
                               (Spearman-Brown)          1976 and 1985 Editions

Concrete Experience            .81                       .89
Reflective Observation         .71                       .87
Abstract Conceptualization     .84                       .92
Active Experimentation         .83                       .92
Abstract/Concrete              .85                       .92
Active/Reflective              .82                       .93


Source: LEARNING STYLE INVENTORY FOR 1985 
TIONS, Boston: McBer & Co., 1985. 
Assessment Perspective

Perhaps because half of all new college students can
be classified as academically underprepared, writers
have assessed the extent of the problem variously.
Abraham (1986) reported 30%; Skinner and Carter
(1987), 40%; Lutz (1979), 43%; Bray (1983), half;
O'Banion (1988), half; and Haase and Caffrey (1983),
60%. Responding to the reality of those estimates,
colleges assess students' skills at matriculation and attempt
to place students into "mathemagenically" appropriate
reading, writing, and mathematics classes. Woods (1985)
reported that an American College Testing and American
Association of Community and Junior Colleges study
found over 90% of two-year colleges offer assessment.
Although Lederman, Ribaudo, and Ryzewic's (1983)
national study found that 97% of all colleges offer
assessment, their study, Skinner and Carter's (1987) Texas
study, and Rounds and Anderson's (1984) California
study confirmed that fewer than two-thirds of all
colleges require assessment, as shown in Table 1.

Table 1

Percent of Colleges with Mandatory Testing*

           California     Texas**        National

Reading    [illegible]    2              69 (91% of the 76% that offer basic reading)
Writing    50             1              65 (85% of the 77% that offer basic writing)
Math       25             [illegible]    64 (86% of the 74% that offer basic math)

*Colleges may not assess all students: the student may already have
a degree, be a transfer student, or have taken an admissions test
in lieu of placement (Skinner & Carter, 1987).

**For the Texas study, colleges testing 100% of new students were
defined as having mandatory testing.
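The national figures in Table 1 are compound percentages: the share of colleges that offer a basic skills course multiplied by the share of those colleges that require testing. The short sketch below reproduces that arithmetic from the percentages printed in the table; the function and variable names are illustrative only.

```python
def mandatory_share(offer_pct, mandatory_among_offering_pct):
    """Percent of all colleges with mandatory testing in a skill area."""
    return offer_pct * mandatory_among_offering_pct / 100


# (percent offering the course, percent of those requiring testing)
national = {
    "reading": (76, 91),
    "writing": (77, 85),
    "math": (74, 86),
}

for area, (offer, mandatory) in national.items():
    print(area, round(mandatory_share(offer, mandatory)))
```

Rounded to whole numbers, the products agree with the 69, 65, and 64 percent reported in the national column.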

Understandably, there is uncertainty concerning the
scope and scale of assessment practices. For instance, 
to say that 90% of all colleges offer assessment is 
misleading since at least one-third of all colleges do not 
have mandatory assessment. For example, in Texas where 



the percentage of assessed new students ranged from 8 
to 100%, the typical school assessed 43% of new students
in reading, 40% in writing, and 45% in math. Only three 
colleges did not assess the basic skills, yet fewer than 
two percent of Texas' two-year schools mandate
assessment (Skinner & Carter, 1987). Our knowledge of
practices in assessment testing also suffers from other
complications and distractions such as myriad cutoff scores
on tests and sparse empirical justification for assessment 
tests. 

Assessment Instruments and Strategies 

As shown in Table 2, two state studies, one regional
study, and a national study examined the diversity of assessment tests:
Rounds and Anderson's (1984) survey of California
community colleges; a Texas study of two-year colleges
(Skinner & Carter, 1987); the Southern Regional Educational
Board's (SREB) survey (Abraham, 1986), which
yielded 100 combinations of tests; and a national study
(Lederman, Ribaudo, & Ryzewic, 1983) of 1,269 colleges.

Discussion of surveys 

Expectedly, the surveys show that the ND is the most 
common reading test; beyond that, the surveys 
demonstrate state and regional interests. For instance, 
the Coop-Reading, which was used in 7 out of 99 Califor- 
nia colleges, was not mentioned in the SREB study. The 
SAT, on the other hand, was used once for reading assess- 
ment among the 99 California colleges. 

The research makes a weak case for reading tests' use 
and ability to predict success in classes. First, the ND 
was not without criticism. A timed test such as the ND
is not appropriate for developmental students (Clary, 
1973; Kerstiens, 1986a, 1986b). Due to ND cutoff scores 
as low as grade 7, Abraham (1986) found college level
to be meaningless; the lowest cutoff level allows 99% of 
all students to enter college-level classes. Second, scores 
do not correlate well with students' success. Santa Rosa 
Table 2

Reported Frequency* of Assessment Tests**

Reading

California            Texas                 Southern Area         National***
N    Test             N    Test             N    Test             N    Test
18   ND               24   ND               121  ND               236  ND
9    CGP              13   ASSET            89   ACT-Misc.        152  Local test
7    Coop-Reading     7    DTLS             37   DTLS             81   ACT
6    Davis            4    SDRT             35   SAT-Verbal       55   SAT-Verbal
5    Local test       4    DAT              29   ASSET            47   CGP
14   Other            7    Other            62   Other            106  Other

Writing/English

California            Texas                 Southern Area         National***
N    Test             N    Test             N    Test             N    Test
64   Essay            13   ASSET            108  ACT-Misc.        365  Local test
18   Local test       9    DTLS             66   Local test       127  ACT
10   Coop-Eng./Rd.    6    ND               57   Essay            98   SAT-Verbal
9    CGP              3    TSWE             53   TSWE             96   Essay
8    ND               3    WEEP             28   SAT-Verbal       62   TSWE
32   Other            9    Other            91   Other            84   Other

Mathematics

California            Texas                       Southern Area         National***
N    Test             N            Test           N    Test             N    Test
26   Local test       14           DTMS           118  Local test       393  Local test
10   SCAT-Math        13           ASSET          97   ACT-Misc.        115  ACT
9    CGP              [illegible]  MAA            85   DTMS-Misc.       78   SAT-Math
3    Coop-Math        [illegible]  DAT            47   SAT-Math         40   State test
6    Other            1            Other          29   State test       40   CGP
NA                    NA                          40   Other            NA


Junior College (1984) reported that reading tests did not 
predict success in any course. Yamagishi and Gillmore
(1980) examined the ND and a combination of test scores
and writing samples as predictors of academic success and
found none to be effective. They
concluded the ND may lack predictive validity.

Although local writing tests were widely used, the 
studies again demonstrate state and regional interests. 
For example, 10 California colleges used the Coop- 
English or Reading, but the Coop was not listed among 
the 30 writing tests in the SREB study. 

The research shows that colleges have problems with 
writing assessment. First, colleges may use questionable
assessment devices. Six Texas and eight California
colleges used the ND, a reading test, to place students in
writing classes; the Davis and Coop-Reading were also
used to place students in writing classes in California. 
Second, writing samples alone are not adequate for iden- 
tifying basic-level students (Gordon, 1987). When two 



Legend

American College Testing (ACT)
Assessment of Skills for Successful Entry and Transfer (ASSET)
Comparative Guidance and Placement Battery (MAPS-CGP)
Cooperative (Coop)
Cooperative School College Ability Test (SCAT)
Descriptive Test of Language Skills (MAPS-DTLS)
Descriptive Test of Math Skills (MAPS-DTMS)
Differential Aptitude Tests (DAT)
Mathematical Association of America (MAA)
Multiple Assessment Programs and Services (MAPS)
Nelson Denny (ND)
Scholastic Aptitude Test (SAT)
Stanford Diagnostic Reading Test (SDRT)
Test of Standard Written English (TSWE)
Written English Expression Placement Test (WEEP)



*As Gordon (1987), Guerrero and Robinson (1986), and Olson and Martin (1980) noted, a single assessment is not adequate for placement; therefore,
some colleges give tests in combination. Thus, percentages can be confusing.

**The fact that a college offers a test does not mean that the test is mandatory. Also, mandatory placement does not always follow mandatory testing.
***For the national study, only those tests used by at least 20 colleges were included in the statistics.



instructors read the same papers, Guerrero and Robin- 
son (1986) reported that instructors disagreed on course 
placement 43% of the time. Even when writing samples
are used with objective scores, Olson and Martin (1980) 
found only 39% of the students received the same place- 
ment recommendation. Third, there is a lack of correla- 
tion between writing samples and achievement (Alex- 
ander & Swartz, 1982). 

Local math tests were widely used, but, again, state
and regional differences abounded. Widely used in the 
state and SREB studies, ASSET, SCAT, and DTMS were 
each used in fewer than three percent of the colleges 
in the national study. 

Selecting cutoff points and proving that math testing 
works presented challenges to colleges. Abraham (1986) 
cited evidence that cutoff scores ranged from 1 to 18 
on the DTMS; at the lower level all but 14% of the
students could be placed in college-level classes. There 
is no correlation between placement scores and final 
mathematics grades (Sworder, 1986). 

Even widely used and commonly cited admissions tests 
are questionable placement tools. Morante (1987) cited 
the New Jersey Basic Skills Council's findings that many 
students who had scored above-average on the SAT still 
were not ready for college-level classes. Grulick (1986) 
reported mixed findings after a review of the SAT as an 
assessment tool and argued for local tests. 

High-school grades 

In nontest evaluations, high-school grades are used 
in 43% of the cases for reading, 46% for writing, and 
56% for mathematics (Lederman, Ribaudo, & Ryzewic, 
1983). The use of high-school grades for placement is 
not appropriate (Morante, 1987) since grades provide
an inflated view of students' abilities (Roueche, Baker, 
& Roueche, 1986), and the proficiency required to finish 
three years of high-school English or mathematics is con- 
siderably lower than the level expected of most college 
freshmen (Edge, 1979). 

Summary 

The research demonstrates regional test selection differences
but little evidence to prove the effectiveness of
assessment testing or agreement as to what constitutes
college level. With cutoff scores as low as seventh grade,
college students resemble Lake Wobegon's children: they
are all above average. Perhaps the best assessment approach
is a combination of several measures of each basic
skill plus a professional analysis of the results, and on
this point, the literature is weak.

Making Assessment Work 

Nevertheless, several inferences about successful 
assessment can be drawn. One characteristic of successful
assessment is a plan to meet the local assessment



challenge. Lederman, Ribaudo, & Ryzewic (1985) prof- 
fered entry-level testing, prescriptions, and exit testing. 
Bray (1983) explained California's Learning Assessment
Retention Consortium's plans to develop assessment 
philosophies, establish goals and objectives, set up assess- 
ment centers, select tests, and assess students. 

Second, mandatory testing and placement are essen- 
tial, but as noted in Table 1, at best only two-thirds of 
college students face mandatory assessment and place- 
ment. Lum and Alfred (1987) reported that students in 
compulsory programs were more likely to be persisters 
and perform well on long-term achievement measures 
than students in voluntary programs. For 1,300 students,
Richards (1986) found that 73% followed placement ad- 
vice and succeeded in the recommended classes; 15% 
followed advice and did not succeed; 6% neither followed 
advice nor succeeded; and 6% did not follow advice but 
succeeded. Roueche, Baker, & Roueche (1986) reported 
that the majority of colleges favored mandatory assessment
but were not strongly in favor of mandatory placement;
they concluded that assessment is futile without
mandatory placement. 
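
The four percentages reported for Richards (1986) can be regrouped by whether students followed the placement advice. The following is a minimal Python sketch of that arithmetic; the dictionary layout and the function name are illustrative assumptions, and only the four percentages come from the study as summarized above. "Succeeded" is taken here simply as the outcome label used in that summary.

```python
# Outcome shares (percent of students) as reported for Richards (1986).
# Keys are (followed_advice, succeeded); this layout is an assumption
# made for the illustration, not part of the original study.
outcomes = {
    (True, True): 73,    # followed advice and succeeded
    (True, False): 15,   # followed advice and did not succeed
    (False, False): 6,   # did not follow advice and did not succeed
    (False, True): 6,    # did not follow advice but succeeded
}

def success_rate(followed: bool) -> float:
    """Share who succeeded among students who did (or did not) follow advice."""
    succeeded = outcomes[(followed, True)]
    total = succeeded + outcomes[(followed, False)]
    return succeeded / total

rate_followed = success_rate(True)   # 73 / 88
rate_ignored = success_rate(False)   # 6 / 12
print(f"followed advice: {rate_followed:.0%} succeeded")
print(f"ignored advice:  {rate_ignored:.0%} succeeded")
```

On these figures, roughly 83% of the students who followed placement advice succeeded, compared with 50% of those who did not.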

Third, in addition to testing basic skills, colleges 
should consider other instruments that survey students'
study habits and academic confidence. Accordingly, Bliss
and Mueller (1987) discussed a promising assessment
device, the Study Behavior Inventory, which measures
short-term and long-term study behaviors as well as
academic self-perception. Scores on this instrument correlate
with GPA at .79.

Fourth, good programs use technology to improve 
assessment and research. Rounds, Kanter, and Blumin
(1987) note ACT is designing new components for 
computer-adaptive testing. ETS' computer-adaptive test 
(Computerized Placement Test) has reading, sentence 
skills, mathematical reasoning, and algebra components.
In the technologically ideal assessment, correct answers
trigger more difficult questions, and incorrect answers 
trigger easier questions. Thus, frustration, instructional, 
and independent levels can quickly be determined. With 
the new technology, Wainer (1983) noted that students 
can be tested at any time, results are instantaneous, test 
security is improved, students can work at their own 
pace, there are no problems with answer sheets, but there
is a tendency to rely on one instrument for placement. 
At the University of California, Irvine (Shoemaker, St.
John, & Lewis, 1986), computers report means and
percentages of students placed in courses and reliabili- 
ty and item analyses of placement tests. UCI tests in 
chemistry, mathematics, reading, writing, and ESL.
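
The adaptive rule described above (a correct answer triggers a harder question, an incorrect answer an easier one) can be illustrated with a simple "staircase" sketch in Python. This is only an illustration under stated assumptions: the function name, the ten difficulty levels, the starting level, and the number of items are all hypothetical, and the sketch is not the algorithm used by ETS or ACT, whose instruments are not described in that detail here.

```python
from typing import Callable

def adaptive_test(answers_correctly: Callable[[int], bool],
                  levels: int = 10, start: int = 5, items: int = 8) -> int:
    """Return the difficulty level reached after `items` questions.

    A correct answer moves the next question one level up; an incorrect
    answer moves it one level down. Levels are clamped to 1..levels.
    """
    level = start
    for _ in range(items):
        if answers_correctly(level):
            level = min(levels, level + 1)   # harder question next
        else:
            level = max(1, level - 1)        # easier question next
    return level

# Hypothetical student who answers correctly up to level 7 and misses above it.
def student(difficulty: int) -> bool:
    return difficulty <= 7

print(adaptive_test(student))
```

For this hypothetical student the question difficulty climbs from the starting level and then alternates around the boundary between what the student can and cannot answer, which is how such a test can locate a working level in a small number of items.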

Proof that Assessment Works

The use of end-of-quarter grades to demonstrate 
assessment effectiveness requires more thought. Rasor 
and Powell (1984) reported no correlation between test 
scores and final grades for three out of four classes. 



Sworder (1986) found no correlation between placement
scores and final math and English grades. Palmer (1987) 
concluded that most studies found only a low correla- 
tion between assessment tests and grades. The San Diego 
Community College District study (1983) found a significant
relationship between test scores and grades in
English 101 only. However, Morante (1987) noted that
the correlation between assessment scores and grades 
should be close to zero in a good remedial program. 
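
Morante's point can be made concrete with a small Python sketch. The numbers below are invented solely for illustration and are not data from any study cited here; the function and variable names are likewise assumptions. The sketch computes a Pearson correlation between placement scores and final grade points for two hypothetical cases: one in which grades simply track placement scores, and one in which low scorers have been remediated successfully so that final grades no longer depend on the entering score.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Invented placement scores for six hypothetical students.
scores = [10, 20, 30, 40, 50, 60]

# Invented final grade points when grades simply track placement scores.
grades_tracking_scores = [1.0, 1.5, 2.0, 2.5, 3.0, 3.5]

# Invented final grade points when low scorers were remediated successfully.
grades_after_remediation = [2.5, 3.0, 3.0, 2.5, 2.5, 3.0]

r_tracking = pearson_r(scores, grades_tracking_scores)
r_remediated = pearson_r(scores, grades_after_remediation)
print(f"grades track scores:    r = {r_tracking:.2f}")
print(f"effective remediation:  r = {r_remediated:.2f}")
```

In the first case the correlation is essentially perfect; in the second it falls close to zero even though every student finishes with a passing grade, which is why a low correlation alone does not show that placement testing failed.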

The best evidence in favor of mandatory assessment 
and placement comes from Roueche and Baker (1986) 
who noted that students at Miami-Dade who took sug- 
gested developmental classes had a nine times better 
chance of graduating than students who declined to take 
developmental classes. 

Conclusions 

Because of myriad cut-off points on placement tests, 
college level is ambiguous. Nevertheless, half of all col- 
lege students require developmental work. Although 
assessment testing is widespread, there is no reason to 
conclude that as many as two-thirds of all colleges have 
mandatory testing and mandatory placement for reading, 
writing, and mathematics. The use of a single placement 
device, such as high-school grades, writing samples, or 
admissions scores, provides an inaccurate view of 
students' abilities. A good approach to assessment in- 
volves a planned program, mandatory tests for each of 
the basic skills, mandatory placement, use of current 
technology, program assessment, a survey of study habits 
and attitudes, and dissemination of testing information. 
Proof that tests or writing samples predict academic suc- 
cess remains sparse. Empirical evidence that assessment 
works is not to be found in the correlation between placement
scores and final grades (the correlation is close
to zero in many cases). However, if mandatory assessment
and mandatory placement can improve a student's
chances of graduation by a factor of nine, as Roueche
and Baker's (1986) review of Miami-Dade experiences
suggests, then mandatory assessment and mandatory
placement are essential. 
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